
LFM2 parser#4085

Open
przepeck wants to merge 38 commits into main from przepeck/lfm2_parser

Conversation

@przepeck
Collaborator

🛠 Summary

CVS-182500
Adding tool parser for the LFM2 model

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

Contributor

Copilot AI left a comment


Pull request overview

Adds LFM2 tool-call parsing support to OVMS LLM I/O processing, wiring it into the existing OutputParser tool-parser selection and providing a dedicated LFM2 output parser test suite.

Changes:

  • Introduced Lfm2ToolParser (unary + streaming parsing) under src/llm/io_processing/lfm2/.
  • Registered the new parser in OutputParser (parser name "lfm2") and Bazel build targets.
  • Added comprehensive LFM2 output parser tests (including streaming scenarios and malformed inputs).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
src/test/llm/output_parsers/lfm2_output_parser_test.cpp New gtest coverage for LFM2 tool-call parsing (unary + streaming).
src/llm/io_processing/output_parser.cpp Registers "lfm2" tool parser and includes its header.
src/llm/io_processing/lfm2/tool_parser.hpp Declares Lfm2ToolParser state machine and parsing helpers.
src/llm/io_processing/lfm2/tool_parser.cpp Implements LFM2 unary + streaming parsing and argument normalization.
src/llm/BUILD Adds Bazel library target for the LFM2 parser and links it into output_parsers.

Comment on lines +288 to +313
```cpp
std::optional<rapidjson::Document> Lfm2ToolParser::parseChunk(const std::string& chunk, ov::genai::GenerationFinishReason finishReason) {
    if (chunk.empty()) {
        return std::nullopt;
    }

    this->streamingContent += chunk;

    if (parseNewContent()) {
        if (this->currentState == State::ToolCallParameters) {
            return BaseOutputParser::wrapFirstDelta(this->toolCall.name, toolCallIndex);
        }
        if (this->currentState == State::ToolCallEnded) {
            return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex);
        }
        if (this->currentState == State::Content) {
            auto content = this->streamingContent.substr(this->streamingPosition);
            this->streamingPosition += content.size();
            return wrapDeltaContent(content);
        }
        if (this->currentState == State::AfterToolCall) {
            this->currentState = State::Content;
        }
    }

    return std::nullopt;
}
```

Copilot AI Mar 25, 2026


Streaming: streamingContent is only appended to and never trimmed, while streamingPosition monotonically increases. For long responses this can grow without bound and increase substring/search costs over time. Consider erasing the processed prefix (or keeping only an unprocessed tail) once streamingPosition moves forward, to keep memory and time bounded.

@dtrawins dtrawins requested a review from atobiszei March 25, 2026 13:15
przepeck and others added 5 commits March 25, 2026 14:52
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```cpp
const std::string Lfm2ToolParser::TOOL_ARGS_END_INDICATOR = ")";
const std::string Lfm2ToolParser::TOOL_SEPARATOR_STR = ", ";

void Lfm2ToolParser::writeArgumentOfAnyType(const rapidjson::Value& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) {
```
Collaborator


If that's a standalone function, could it be static? It also sounds quite generic. Can we extract it to utils? Not sure if we already have such a file in io_processing...

```cpp
    }
    writer.EndObject();
} else {
    writer.String("");
```
Collaborator


Should we write anything here if we're logging an error?

Collaborator Author

@przepeck przepeck Mar 25, 2026


Otherwise the key would be left dangling without a value, like this: `{key: bad_value, another_key: value}` -> `{"key", "another_key":"value"}`

```cpp
const char last = normalized.back();
if ((first == '{' && last == '}') || (first == '[' && last == ']')) {
    std::replace(normalized.begin(), normalized.end(), '\'', '"');
    SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized);
```
Collaborator


Is it always safe? What if we get an argument with a list of strings, e.g. `["hello", "it's me"]`?

Collaborator Author


You're right, I'll think about how to change it.

Collaborator Author


I've changed it and tested it with the ParseToolCallWithStringArgumentsArrayWithStringsContainingQuotes and ParseToolCallWithStringArgumentsObjectWithStringsContainingQuotes tests.

```cpp
size_t pos = 0;
int main_guard = 0;

while (pos != std::string::npos && main_guard < MAX_TOOL_CALLS) {
```
Collaborator


Is this guard really needed?

Collaborator Author


This parser will be used with models from 2.6B to 24B. When the 2.6B model is used for multi-turn conversations it gets confused, generates tool calls, and doesn't follow its usual template. To avoid infinite loops I would leave the guard in.

```cpp
size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition);
size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition);
if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) {
    SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos);
```
Collaborator


So at that point we know that tools were generated? Won't it catch the end of a tool call even if the model did not output any tools but, for example, just an array of numbers?

